Non-linear predictive vector quantization of feature vectors for distributed speech recognition

Authors

  • José Enrique García Laínez
  • Alfonso Ortega
  • Antonio Miguel
  • Eduardo Lleida
Abstract

In this paper, we present a non-linear prediction scheme based on a Multi-Layer Perceptron for Predictive Vector Quantization (PVQ-MLP) of MFCC features, aimed at very low bit-rate coding of acoustic features in distributed speech recognition (DSR). Applications such as voice-enabled web browsing or speech-controlled processes in large industrial plants, where hundreds of users simultaneously access the same ASR server, can benefit from this substantial bit-rate reduction. Experimental results obtained on a large-vocabulary task show that PVQ-MLP outperforms a linear prediction scheme in terms of prediction gain and WER, especially at low bit-rates. Using PVQ-MLP, the bit-rate can be reduced down to 1.8 kbps, a reduction of 66% with respect to the ETSI standards (4.4 kbps), with a WER degradation lower than 5% compared to a system without quantization.
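The closed-loop structure of a predictive vector quantizer can be sketched as follows. This is a minimal illustration, not the authors' implementation: the predictor is a tiny one-hidden-layer MLP with random, untrained weights, the codebook is random, and the names `make_mlp_predictor`, `pvq_encode`, and `pvq_decode`, the 13-dimensional frames, and the 64-entry codebook are all assumptions made for the example. The point it shows is that the encoder predicts each frame from the previously reconstructed frame, quantizes only the prediction residual, and that a decoder with the same predictor and codebook can rebuild the same frames from the codeword indices alone.

```python
import numpy as np

def make_mlp_predictor(dim, hidden, rng):
    """Hypothetical one-hidden-layer MLP with random (untrained) weights."""
    w1 = rng.normal(scale=0.1, size=(dim, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.1, size=(hidden, dim))
    b2 = np.zeros(dim)

    def predict(prev):
        # Non-linear prediction of the current frame from the previous one.
        return np.tanh(prev @ w1 + b1) @ w2 + b2

    return predict

def pvq_encode(frames, codebook, predict):
    """Closed-loop PVQ: quantize the residual between frame and prediction."""
    prev = np.zeros(frames.shape[1])  # reconstructed previous frame
    indices, recon = [], []
    for x in frames:
        pred = predict(prev)
        residual = x - pred
        # Nearest codeword to the prediction error (squared Euclidean distance).
        idx = int(np.argmin(np.sum((codebook - residual) ** 2, axis=1)))
        prev = pred + codebook[idx]  # what the decoder will also reconstruct
        indices.append(idx)
        recon.append(prev)
    return indices, np.array(recon)

def pvq_decode(indices, codebook, predict):
    """Rebuild frames from codeword indices using the same predictor."""
    prev = np.zeros(codebook.shape[1])
    out = []
    for idx in indices:
        prev = predict(prev) + codebook[idx]
        out.append(prev)
    return np.array(out)

rng = np.random.default_rng(0)
dim = 13                                  # assumed feature dimension
frames = rng.normal(size=(50, dim))       # stand-in for MFCC frames
codebook = rng.normal(size=(64, dim))     # assumed 64-entry (6-bit) codebook
predict = make_mlp_predictor(dim, 8, rng)

indices, enc_recon = pvq_encode(frames, codebook, predict)
dec_recon = pvq_decode(indices, codebook, predict)
bits_per_frame = int(np.log2(len(codebook)))
```

Because the encoder feeds its own reconstruction back into the predictor, encoder and decoder stay synchronized, and only `bits_per_frame` bits need to be transmitted per frame in this toy setup. Swapping `predict` for a linear map would give the linear-prediction baseline that the abstract compares against.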

Similar resources

Non-linear Compression of Fea Transform Coding and Non-unif

This paper uses transform coding for compressing feature vectors in distributed speech recognition applications. Feature vectors are first grouped together into non-overlapping blocks and a transformation applied. A non-uniform allocation of bits to the elements of the resultant matrix is based on their relative information content. Analysis of the amplitude distribution of these elements indic...

Predictive vector quantization using the M-algorithm for distributed speech recognition

In this paper, we present a predictive vector quantizer for distributed speech recognition that makes use of a delayed decision coding scheme, performing the optimal codeword search by means of the M-algorithm. In single-path predictive vector quantization coders, each frame is coded with the codeword closest to the prediction error. However, prediction errors and quantization errors of futur...

Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

In this paper, the use of clustering algorithms for decision-level data fusion is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...

Low bit-rate feature vector compression using transform coding and non-uniform bit allocation

This paper presents a novel method for the low bit-rate compression of a feature vector stream with particular application to distributed speech recognition. The scheme operates by grouping feature vectors into non-overlapping blocks and applying a transformation to give a more compact matrix representation. Both Karhunen-Loeve and discrete cosine transforms are considered. Following transforma...

Differential vector quantization of feature vectors for distributed speech recognition

Distributed speech recognition emerged to address the computational limitations of mobile devices such as PDAs or mobile phones. Due to bandwidth restrictions, it is necessary to develop efficient techniques for transmitting acoustic features in Automatic Speech Recognition applications. This paper presents a technique for compressing acoustic feature vectors based on Differential Vector Quantization. ...

Publication date: 2010